-
Notifications
You must be signed in to change notification settings - Fork 478
Fixes bug with getting disk usage of the ROOT table #5479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Code did not consider ROOT table, and was always returning 0 for the disk usage as it would scan the ROOT table if the input was the METADATA table and would scan the METADATA table otherwise. Should have been scanning ZK if the input was ROOT. Changed to use Ample instead, which handles this issue nicely.
|
Working on fixing the unit test which was mocking based on old impl |
|
I'm guessing it reported zero for the disk usage because the ROOT table is stored in ZooKeeper, not HDFS. Not suggesting your fix is not applicable, just providing possible background. |
dlmarion
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Isn't it just the metadata for the files that's stored in ZK, not the files themselves? I think the files are written to HDFS even for ROOT, so "disk usage" still makes sense in this case |
|
The failing unit test had to essentially be completely rewritten. Up to date now |
server/base/src/main/java/org/apache/accumulo/server/util/TableDiskUsage.java
Outdated
Show resolved
Hide resolved
| final ServerContext client = EasyMock.createMock(ServerContext.class); | ||
| final Scanner scanner = EasyMock.createMock(Scanner.class); | ||
| mockScan(client, scanner, 1); | ||
| final ClientContext client = EasyMock.createMock(ClientContext.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more levels of mocking there is in test code, the harder it is to refactor code the cod under test. Sometimes its better to change the code to make it easy to test and to minimize the need for mocking. For example maybe could have changed :
public static Map<SortedSet<String>,Long> getDiskUsage(Set<TableId> tableIds,
AccumuloClient client) to :
public static Map<SortedSet<String>,Long> getDiskUsage(Set<TableId> tableIds,
Function<TableId, Iterable<Map<StoredTabletFile,DataFileValue>>> tableFilesProvider) Maybe this would make the code easier to test. Not a change for this PR as the changes w/ easy mock were already made. just something to consider for future changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree the mocking that's done for this test now is definitely not ideal. I did not consider doing something like that, but you're right it would have made things a lot simpler for the test here. I'll be keeping this in mind for future changes, thanks!
server/base/src/test/java/org/apache/accumulo/server/util/TableDiskUsageTest.java
Outdated
Show resolved
Hide resolved
* Fixes bug with getting disk usage of the ROOT table Code did not consider ROOT table, and was always returning 0 for the disk usage as it would scan the ROOT table if the input was the METADATA table and would scan the METADATA table otherwise. Should have been scanning ZK if the input was ROOT. Changed to use Ample instead, which handles this issue nicely. * rewrote TableDiskUsageTest
Code did not consider ROOT table, and was always returning 0 for the disk usage as it would scan the ROOT table if the input was the METADATA table and would scan the METADATA table otherwise. Should have been scanning ZK if the input was ROOT. Changed to use Ample instead, which handles this issue nicely.
Discovered when working on #5474